Speech-Rate-Variable HMM-Based Japanese TTS System
Abstract
This paper proposes a new method for controlling phoneme duration according to an arbitrary target speech rate in text-to-speech (TTS) synthesis systems. The proposed method first constructs three fundamental duration models at “fast”, “normal”, and “slow” speech rates using Hayashi’s Quantification Theory (Type 1) based on real speech databases, and then creates a duration model for the target speech rate by interpolating the fundamental models. Our TTS system uses an HMM-based synthesizer, which allows flexible prosody control. Speech synthesized by the proposed method is evaluated in subjective experiments at four speech rates, using paired-comparison tests between the proposed method and a rule-based method. The results show that the proposed method achieves higher naturalness in synthesized speech than the rule-based method.
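As a rough illustration of the interpolation idea in the abstract, the following Python sketch predicts a phoneme duration at an arbitrary speech rate from three fundamental models. Everything concrete here is an assumption made for illustration and is not taken from the paper: the factor names (`phoneme`, `position`), all numeric scores, the speech-rate axis in morae per second, and the choice of piecewise-linear interpolation between neighboring models. Each model is written as an additive model over categorical factors, which is the general form of a Quantification Theory Type 1 predictor.

```python
from bisect import bisect_right

# Hypothetical fundamental duration models (all values are made up).
# Each model is additive over categorical factors: bias + per-category scores,
# giving a predicted duration in milliseconds.
FUNDAMENTAL_MODELS = {
    "slow":   {"bias": 95.0, "phoneme": {"a": 15.0, "k": -12.0},
               "position": {"initial": 3.0, "final": 20.0}},
    "normal": {"bias": 70.0, "phoneme": {"a": 10.0, "k": -10.0},
               "position": {"initial": 2.0, "final": 12.0}},
    "fast":   {"bias": 55.0, "phoneme": {"a": 8.0, "k": -9.0},
               "position": {"initial": 1.0, "final": 6.0}},
}

# Assumed speech rate (morae per second) of each fundamental model,
# sorted from slowest to fastest.
MODEL_RATES = [("slow", 5.0), ("normal", 8.0), ("fast", 11.0)]


def predict(model, phoneme, position):
    """Additive prediction: bias plus the score of each active category."""
    return model["bias"] + model["phoneme"][phoneme] + model["position"][position]


def interpolated_duration(target_rate, phoneme, position):
    """Piecewise-linear interpolation between the two neighboring models.

    Rates outside the modeled range are clamped to the nearest model.
    """
    rates = [rate for _, rate in MODEL_RATES]
    t = min(max(target_rate, rates[0]), rates[-1])
    # Index of the upper neighbor, kept within [1, len(rates) - 1].
    hi = min(max(bisect_right(rates, t), 1), len(rates) - 1)
    lo = hi - 1
    w = (t - rates[lo]) / (rates[hi] - rates[lo])
    d_lo = predict(FUNDAMENTAL_MODELS[MODEL_RATES[lo][0]], phoneme, position)
    d_hi = predict(FUNDAMENTAL_MODELS[MODEL_RATES[hi][0]], phoneme, position)
    return (1.0 - w) * d_lo + w * d_hi


if __name__ == "__main__":
    for rate in (5.0, 6.5, 8.0, 9.5, 11.0):
        print(rate, interpolated_duration(rate, "a", "final"))
```

Because each model is linear in its scores, interpolating the predicted durations in this sketch is equivalent to interpolating the model scores themselves and then predicting once.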
Similar resources
Variable Speech Rate Mandarin Chinese Text-to-Speech System
This paper presents a hidden Markov model (HMM)-based variable speech rate Mandarin Chinese text-to-speech (TTS) system. In this system, parameters of spectrum, fundamental frequency and state duration are generated by a context-dependent HMM (CDHMM) whose model parameters are linearly interpolated from those of three CDHMMs trained on corpora at three different speech rates (SRs), i.e. fast, med...
XIMERA: a new TTS from ATR based on corpus-based technologies
This paper describes a new concatenative TTS system under development at ATR. The system, named XIMERA, is based on corpus-based technologies, as was the case for the preceding TTS systems from ATR, namely ν-talk and CHATR. The prominent features of XIMERA are (1) large corpora (a 110-hour corpus of a Japanese male, a 60-hour corpus of a Japanese female, and a 20-hour corpus of a Chinese fema...
Performance Analysis of Text To Speech Synthesis System Using HMM And Prosody Features With Parsing For Tamil Language
This paper describes a hidden Markov model (HMM)-based TTS system and a prosody-based TTS system for producing natural-sounding synthetic speech in the Tamil language. The HMM-based system consists of two phases, training and synthesis. Tamil speech is first parameterized into spectral and excitation features using Glottal Inverse Filtering (GIF). An emotions present in the input text is...
Development of HMM-based Malay Text-to-Speech System
This paper presents the development of a hidden Markov model (HMM)-based Malay text-to-speech (TTS) system. To our knowledge, this is the first report on the development of an HMM-based speech synthesis system for the Malay language. In this paper, we first discuss the characteristics of Malay speech, specifically the Malay phonological system and syllable structure. In the Malay phonological sys...